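Chapter 3 Review: Representations of Data
Stem and leaf diagrams, features: retain the original data values and show the shape of the distribution directly.
Structure: stem (leading digits) + leaf (final digit) + key (explains how to read an entry)
Uses: finding the quartiles, median, and mode; back-to-back stem and leaf diagrams compare two data sets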
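The stem/leaf/key structure described above can be sketched in a few lines of Python. This is only an illustration: the `heights` list is made-up sample data, `stem_and_leaf` is a hypothetical helper name, and the sketch assumes whole-number values whose last digit is the leaf. Stems with no leaves are simply omitted.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group whole-number values into (stem, leaves) rows, leaves in ascending order."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # leaf = last digit, stem = the digits before it
        rows[stem].append(leaf)
    return sorted(rows.items())

# Made-up heights in cm, for illustration only
heights = [152, 167, 161, 158, 170, 163, 155, 172]

for stem, leaves in stem_and_leaf(heights):
    print(f"{stem:>3} | {' '.join(str(leaf) for leaf in leaves)}  ({len(leaves)})")
print("Key: 15 | 2 = 152 cm")
```

Because the leaves are stored in order, the median and quartiles can be read off by counting along the rows.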
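Histograms, key principle: the area of each bar is proportional to the frequency
Vertical axis: frequency density = frequency ÷ class width
Frequency polygon: join the midpoints of the tops of the bars
Box plots, components: lower quartile (Q₁), median (Q₂), upper quartile (Q₃)
Also shown: maximum, minimum, outliers
Use: comparing the location and spread of two data sets
Outliers, test: a value is an outlier if it is > Q₃ + k(Q₃ - Q₁) or < Q₁ - k(Q₃ - Q₁)
Value of k: usually 1.5
Handling: outliers may be removed during data cleaning, typically when they are clearly errors
Skewness, ways to assess it:
Skewness formula: \( \frac{3(\text{mean} - \text{median})}{\text{standard deviation}} \)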
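As a quick check of the formula, here is a minimal Python sketch. The function names are illustrative, and the sample values (mean 50, median 45, standard deviation 10) are the ones used in the skewness exercise later in this review.

```python
def skewness_coefficient(mean, median, standard_deviation):
    """3(mean - median) / standard deviation, as in the formula above."""
    return 3 * (mean - median) / standard_deviation

def describe_skew(coefficient):
    """Positive value: mean above median; negative value: mean below median."""
    if coefficient > 0:
        return "positive skew"
    if coefficient < 0:
        return "negative skew"
    return "symmetrical"

value = skewness_coefficient(50, 45, 10)
print(value, describe_skew(value))  # 1.5 positive skew
```

The sign carries the interpretation: the coefficient is positive exactly when the mean exceeds the median.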
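Comparing data sets, principle: compare both a measure of location and a measure of spread
Consistency: do not mix measures from different pairs (e.g. median with standard deviation)
Choice: with extreme values use the median and interquartile range; without extreme values use the mean and standard deviation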
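One way to see why extreme values favour the median is to drop the extreme value and watch how far each measure moves. The sketch below uses the books-read data from the outlier exercise in this review; it assumes the population standard deviation (`statistics.pstdev`), which may differ from the convention your course uses.

```python
import statistics

# Books-read data from the outlier exercise (the largest value is 80)
books = [8, 12, 15, 18, 20, 22, 25, 28, 30, 32, 35, 40, 45, 50, 80]
without_extreme = books[:-1]  # same data with the largest value dropped

for label, data in [("with 80", books), ("without 80", without_extreme)]:
    print(
        label,
        "mean =", round(statistics.mean(data), 2),
        "median =", statistics.median(data),
        "sd =", round(statistics.pstdev(data), 2),  # population sd: an assumption
    )
```

Dropping the single value 80 moves the mean by about 3.5 but the median by only 1.5, which is the sense in which the median is the more robust measure of location.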
The stem and leaf diagram below shows the heights (in cm) of 26 students.
| Stem | Leaf | Key: 15|2 = 152 cm |
|---|---|---|
| 15 | 2 3 5 6 8 | (5) |
| 16 | 0 1 1 3 4 7 8 | (7) |
| 17 | 0 2 2 5 6 9 | (6) |
| 18 | 1 3 4 8 9 | (5) |
| 19 | 0 2 5 | (3) |
a) Draw the stem and leaf diagram.
b) Find Q₁, Q₂, Q₃.
c) Calculate the interquartile range (IQR).
Answer space:
A histogram shows the time (in minutes) spent on homework by 60 students. The table below summarises the data:
| Time (min) | 10 ≤ t < 20 | 20 ≤ t < 30 | 30 ≤ t < 40 | 40 ≤ t < 60 |
|---|---|---|---|---|
| Frequency | 10 | 20 | 15 | 15 |
a) Calculate the frequency density for each class.
b) Draw the histogram and frequency polygon.
Answer space:
The following data show the number of books read by 15 students in a month:
8, 12, 15, 18, 20, 22, 25, 28, 30, 32, 35, 40, 45, 50, 80
a) Find Q₁, Q₂, Q₃ and IQR.
b) Determine if there are any outliers using k = 1.5.
Answer space:
Two box plots show the test scores of students in Class A and Class B.
| | Class A | Class B |
|---|---|---|
| Q₁ | 55 | 60 |
| Q₂ | 65 | 70 |
| Q₃ | 75 | 80 |
| Min | 40 | 50 |
| Max | 85 | 90 |
a) Draw box plots for both classes.
b) Compare the test scores of Class A and Class B in terms of location and spread.
Answer space:
For a data set, the mean is 50, the median is 45, and the standard deviation is 10.
a) Calculate the skewness using the formula \( \frac{3(\text{mean} - \text{median})}{\text{standard deviation}} \).
b) Describe the skewness and justify using the relationship between mean, median, and the skewness formula.
Answer space:
The back-to-back stem and leaf diagram below shows the number of goals scored by two football teams in a season.
| Team A | Stem | Team B |
|---|---|---|
| 8 6 5 | 0 | 7 9 |
| 9 7 5 3 | 1 | 2 4 6 8 |
| 8 6 4 2 | 2 | 3 5 7 9 |
| 5 3 1 | 3 | 4 6 8 |
| 2 | 4 | 5 7 |
Key: 1|5 = 15 goals
a) Find the median, Q₁, Q₃ for each team.
b) Compare the goal-scoring performance of Team A and Team B.
Answer space:
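Solution:
b) Quartiles: the diagram lists n = 26 values (5 + 7 + 6 + 5 + 3), so Q₁ is the 7th value (161 cm), Q₂ is the mean of the 13th and 14th values (171 cm), and Q₃ is the 20th value (183 cm).
c) Interquartile range:
IQR = Q₃ - Q₁ = 183 - 161 = 22 cm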
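Solution:
a) Frequency density: frequency density = frequency ÷ class width
| Class (min) | Frequency | Class width | Frequency density |
|---|---|---|---|
| 10 ≤ t < 20 | 10 | 10 | 1.0 |
| 20 ≤ t < 30 | 20 | 10 | 2.0 |
| 30 ≤ t < 40 | 15 | 10 | 1.5 |
| 40 ≤ t < 60 | 15 | 20 | 0.75 |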
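The table can be reproduced with a short Python sketch, which also lists the points a frequency polygon would join (class midpoint against frequency density). The variable names are illustrative; the class boundaries and frequencies are taken from the question.

```python
# (lower bound, upper bound, frequency) for each class in the homework table
classes = [(10, 20, 10), (20, 30, 20), (30, 40, 15), (40, 60, 15)]

densities = []
polygon_points = []
for lower, upper, frequency in classes:
    width = upper - lower
    density = frequency / width  # frequency density = frequency / class width
    densities.append(density)
    polygon_points.append(((lower + upper) / 2, density))  # (class midpoint, bar height)
    print(f"{lower} <= t < {upper}: width {width}, frequency density {density}")

# Bar area = density * width, so the areas add back up to the total frequency
total_area = sum(d * (upper - lower) for d, (lower, upper, _) in zip(densities, classes))
print("total area:", total_area)
```

The last line is a useful self-check when drawing: the bar areas should sum to the 60 students in the survey.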
Solution:
a) Quartiles: the data are already in ascending order
b) Outliers: compare each value with the fences Q₃ + k × IQR and Q₁ - k × IQR
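A minimal sketch of both parts in Python follows. The `quartile` helper is hypothetical and assumes a common textbook convention: work out n × k / 4; if it is a whole number, average that value with the next one, otherwise round up to the next position. Other conventions (including those built into software) can give slightly different quartiles.

```python
def quartile(sorted_data, k):
    """k-th quartile (k = 1, 2, 3) using the n*k/4 'round up or average' rule (an assumption)."""
    n = len(sorted_data)
    whole, remainder = divmod(n * k, 4)
    if remainder == 0:
        # Position is a whole number: average that value and the next one
        return (sorted_data[whole - 1] + sorted_data[whole]) / 2
    # Otherwise round the position up; index `whole` is position whole + 1
    return sorted_data[whole]

books = [8, 12, 15, 18, 20, 22, 25, 28, 30, 32, 35, 40, 45, 50, 80]

q1, q2, q3 = (quartile(books, k) for k in (1, 2, 3))
iqr = q3 - q1
k = 1.5
lower_fence = q1 - k * iqr
upper_fence = q3 + k * iqr
outliers = [x for x in books if x < lower_fence or x > upper_fence]

print("Q1, Q2, Q3, IQR:", q1, q2, q3, iqr)          # 18 28 40 22
print("fences:", lower_fence, upper_fence)          # -15.0 73.0
print("outliers:", outliers)                        # [80]
```

With these quartiles only 80 lies beyond the upper fence of 73, and no value can fall below the lower fence of -15.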
Solution:
b) Comparison:
Location:
Spread:
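The figures needed for the comparison can be pulled straight from the summary table. The sketch below is one way to organise them; the dictionary layout and names are illustrative, and it pairs the median with the IQR (plus the range) so that the measures stay consistent.

```python
# Five-number summaries from the question's table
box_plots = {
    "Class A": {"min": 40, "q1": 55, "q2": 65, "q3": 75, "max": 85},
    "Class B": {"min": 50, "q1": 60, "q2": 70, "q3": 80, "max": 90},
}

summary = {}
for name, s in box_plots.items():
    summary[name] = {
        "median": s["q2"],          # measure of location
        "iqr": s["q3"] - s["q1"],   # measure of spread (middle 50%)
        "range": s["max"] - s["min"],
    }
    print(name, summary[name])
```

On these numbers Class B has the higher median (70 against 65), the two interquartile ranges are equal (20 each), and Class A has the wider range (45 against 40).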
Solution:
a) Skewness:
\[ \text{skewness} = \frac{3(\text{mean} - \text{median})}{\text{standard deviation}} = \frac{3(50 - 45)}{10} = \frac{3 \times 5}{10} = 1.5 \]
b) Description and justification:
Solution:
a) Statistics for each team:
Team A data: 5, 6, 8, 13, 15, 17, 19, 22, 24, 26, 28, 31, 33, 35, 42 (15 values)
Team B data: 7, 9, 12, 14, 16, 18, 23, 25, 27, 29, 34, 36, 38, 45, 47 (15 values)
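The quartiles for both teams can be checked with Python's standard library. This sketch uses `statistics.quantiles` with `method="exclusive"`, which places the quartiles at positions (n + 1)/4, (n + 1)/2 and 3(n + 1)/4; for n = 15 these are exactly the 4th, 8th and 12th values. That agreement with hand calculation is specific to this sample size and should not be assumed for other data sets.

```python
import statistics

team_a = [5, 6, 8, 13, 15, 17, 19, 22, 24, 26, 28, 31, 33, 35, 42]
team_b = [7, 9, 12, 14, 16, 18, 23, 25, 27, 29, 34, 36, 38, 45, 47]

results = {}
for name, goals in [("Team A", team_a), ("Team B", team_b)]:
    # n=4 asks for quartiles; "exclusive" uses positions based on n + 1
    q1, q2, q3 = statistics.quantiles(goals, n=4, method="exclusive")
    results[name] = {"q1": q1, "median": q2, "q3": q3, "iqr": q3 - q1}
    print(name, results[name])
```

This gives Q₁ = 13, median = 22, Q₃ = 31 for Team A and Q₁ = 14, median = 25, Q₃ = 36 for Team B, so the interquartile ranges are 18 and 22 respectively.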
b) Comparison of goal-scoring performance: